System-level trends now integrate embedded functionality onto growing numbers of system-on-a-chip (SOC) designs, including memory, CPUs, microprocessors, analog interfaces, and DSPs. These large integrated designs also continue to grow more and more complex, harder to verify and test, and much more expensive to take from design to mask set to silicon. At the same time, the SOC design community is coming under increasing pressure to shorten design cycles and time to market.
The one functional element missing in the SOC as it evolves onto a single piece of silicon is the field programmable gate array (FPGA). For years, FPGAs have been helping to resolve time-to-market issues and ease design complexities and related problems such as changing standards. System integration is part of the natural evolution of electronics and so it's inevitable that embedded FPGA reconfigurability should offer a natural addition to the emerging standard-cell SOC.
Actel Corp. (Sunnyvale, CA) created an embedded FPGA business unit to market reprogrammable star IP cores to ASIC and ASSP manufacturers. Drawing on Actel's experience in the PLD market, this unit has developed and is now introducing a new set of IP cores, the VariCore embedded programmable gate arrays, or EPGAs. These cores constitute a family of reprogrammable embedded logic that should help to cut SOC development costs, get ASIC and ASSP SOC product applications to market sooner, and keep the products on the market longer (see Figure 1).
Standard-cell SOC designs
There are three general benefits that can be derived from having embedded reconfigurability available to the system-level designer of ASICs or ASSPs: risk reduction, design flexibility, and improved design security. All three converge to enable the ultimate business goal of getting design-secure products to market earlier and in a more cost-effective manner. In addition, reconfigurability allows these products to remain competitive in the market longer.
Mask-set costs are expected to run over $1 million as process geometries move to 0.13 micron and smaller; as design complexity increases, so does the physical complexity of silicon production technology. Design changes and silicon re-spins not only risk a missed market window; the additional non-recurring engineering (NRE) charges they incur can push total unit costs beyond acceptable levels.
At the lower end of SOC design, there's increasing pressure from standalone FPGAs, whose continuing value proposition is faster time to market. Today, dedicated FPGAs are getting bigger and adopting "platforms" to facilitate SOC functionality. In this environment, reducing the design risk and total cost of ASICs and ASSPs is essential to the bottom line. Risk reduction in the design cycle is absolutely necessary to help speed complex chip-level systems from design to the marketplace.
Adding embedded reconfigurability to system-level standard cell designs will make it far easier than ever before to implement design iterations and cope with changing standards.
The flexibility of an embedded reprogrammable core can help to debug complex designs anywhere in the design cycle, substantially reducing the risk of additional NRE and mask-set costs. In fact, changing design features or unstable industry standards can be efficiently and safely iterated in parallel with silicon tape-out, saving time to market. This virtually eliminates the risk associated with such a bold maneuver in a "non-reconfigurable" SOC environment and removes the need for the less efficient software editing techniques, which often compromise system performance.
With reprogrammability added to standard-cell SOCs, versions and variants become a real capability for product development. An entire product family can be iterated from a single base design and mask set with blocks of reprogrammable cells included. This capability could eliminate the re-spins required today for version and variant products, where minor changes to a feature set or bus interface can differentiate an entire product line. Common examples of version products are printers, wireless handsets, and networking equipment where product features remain the same, but different bus interfaces are required. Examples of variant products include copiers, LCD controllers, and automotive electronics. Product life cycles can be extended by programming design changes in reprogrammable EPGA logic on the fly after production has commenced or even after the product is in the field.
Reprogrammable logic resident in an SOC device means that new features can be added, and changed standards implemented, in finished products simply by reprogramming blocks on the SOC device. The shelf life of nearly any product could therefore be extended, and new product development costs reduced, by adding embedded FPGA cores to ASICs and ASSPs.
For system-level designs, reprogrammable logic embedded within the SOC itself, rather than off chip as a dedicated FPGA, dramatically reduces the system's vulnerability to reverse engineering. There may be no I/O pins and, therefore, no discernible access from off-chip to embedded reprogrammable cores.
Three principal product criteria
Feedback from potential users says there are three primary measurement criteria an embeddable FPGA core must pass to be considered for use in an SOC device. The first measure is the die area consumed to achieve on-chip reconfigurability. The second measure is the ratio of performance to die area, which must be superior to a dedicated standalone FPGA or embedded processor. Lastly, an embedded FPGA core must produce a better performance-to-power ratio when compared to standard software IP alternatives.
To be truly efficient in an SOC application, FPGA cores need to be designed specifically to fit in a cell-based environment. The Actel EPGA cores use a LUT (look-up table) architecture, which employs a five-transistor SRAM cell structure instead of the usual six-transistor configuration, producing the smallest FPGA SRAM cells currently available. The largest logic block of the 0.18-micron EPGA family is a 40,000 ASIC gate core, 20 mm square, which meets the first screening criterion above, related to die area.
As for the second measure, the 0.18-micron EPGA family achieves clock speeds of 250 MHz, which allows it to meet on-chip speeds of 50 to 100 MHz without sacrificing die area. This level of on-chip performance matches well with the FPGA performance requirements for the majority of SOC designs we are seeing today. When coupled with the smallest functional die area, 0.18-micron EPGA cores meet criterion number two, related to performance.
At 0.075 µW/MHz/ASIC gate, the EPGA silicon architecture meets the third criterion, relating power to performance; it burns one-third to one-half the power of an equivalent standalone SRAM FPGA device. For example, on-chip applications with 80-percent utilization running at 50 to 100 MHz have power consumption between 100 and 200 mW. This is quite acceptable when compared to an off-chip dedicated SRAM FPGA with the same functionality and the same percentage of utilization. The VariCore design tool available with the core employs clock-gating techniques to disable unused registers and logic and generates an EPGA configuration bitstream optimized for low power.
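The power figure above lends itself to a quick back-of-the-envelope check. The sketch below is only a rough estimate under stated assumptions: that the coefficient is in microwatts per MHz per ASIC gate, that power scales linearly with the number of utilized gates and with clock frequency, and that the core in question is the 40,000-gate device. The function name and parameters are illustrative and are not part of any Actel tool.

```python
def estimate_core_power_mw(asic_gates, utilization, clock_mhz,
                           uw_per_mhz_per_gate=0.075):
    """Rough power estimate in milliwatts.

    Assumes a simple linear model: power = gates in use x clock x coefficient.
    The default coefficient is the article's figure, read here as microwatts.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    if asic_gates < 0 or clock_mhz < 0:
        raise ValueError("gate count and clock frequency must be non-negative")
    active_gates = asic_gates * utilization
    power_uw = active_gates * clock_mhz * uw_per_mhz_per_gate
    return power_uw / 1000.0  # microwatts -> milliwatts


if __name__ == "__main__":
    # Assumed scenario: largest 40,000-gate core at 80 percent utilization.
    for mhz in (50, 100):
        mw = estimate_core_power_mw(40_000, 0.80, mhz)
        print(f"{mhz} MHz: about {mw:.0f} mW")
```

Under these assumptions the model gives roughly 120 mW at 50 MHz and 240 mW at 100 MHz, the same order of magnitude as the 100 to 200 mW range quoted above.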
The core architecture
An EPGA core is essentially an FPGA function that is implemented as an intellectual property (IP) core rather than as a standalone silicon device. Actel's EPGA cores are implemented in 0.18-micron SRAM technology and are being brought up on processes at UMC, Chartered Semiconductor, and TSMC. The 0.18-micron family of EPGA cores is available in densities from 5,000 to 40,000 ASIC gates. This is equivalent to more than 206,000 FPGA system gates. EPGA cores are scalable and configurable and can be organized into different horizontal and vertical shapes and sizes.
However, some designs may require less conforming shapes. In those cases, EPGA shapes can be customized to fit the specific application. EPGA cores are made up of primary embedded gate (PEG) blocks, each the equivalent of approximately 2,500 ASIC gates or 10,000 system gates. Each PEG contains 256 registers and 512 LUTs, with 40 inputs and 40 outputs per external register edge. For example, the 4x4 V18L4x4 core contains 16 PEGs with a total of 4,096 registers, 8,192 LUTs and 640 inputs and outputs. Each PEG block consists of smaller function group (FG) blocks, which contain the EPGA logic units (see Figure 2). There are 1,024 FGs in a V18L4x4 EPGA core.
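The per-PEG figures make it easy to tabulate the resources of a rectangular core. The following is a minimal sketch, not an Actel utility: the names are hypothetical, the count of 64 FGs per PEG is derived by dividing the 1,024 FGs of the V18L4x4 core by its 16 PEGs, and the I/O count assumes that every PEG edge on the perimeter of the array contributes 40 inputs and 40 outputs.

```python
from dataclasses import dataclass

# Per-PEG figures quoted in the article.
ASIC_GATES_PER_PEG = 2_500
REGISTERS_PER_PEG = 256
LUTS_PER_PEG = 512
IO_PER_EXTERNAL_EDGE = 40  # 40 inputs and 40 outputs per external edge

# Assumption: derived from 1,024 FGs / 16 PEGs in the V18L4x4 core.
FGS_PER_PEG = 64


@dataclass(frozen=True)
class CoreResources:
    pegs: int
    asic_gates: int
    registers: int
    luts: int
    function_groups: int
    inputs: int
    outputs: int


def core_resources(rows, cols):
    """Tabulate the resources of a rows x cols array of PEG blocks."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be at least 1")
    pegs = rows * cols
    # Assumption: only PEG edges on the array perimeter carry external I/O.
    external_edges = 2 * (rows + cols)
    return CoreResources(
        pegs=pegs,
        asic_gates=pegs * ASIC_GATES_PER_PEG,
        registers=pegs * REGISTERS_PER_PEG,
        luts=pegs * LUTS_PER_PEG,
        function_groups=pegs * FGS_PER_PEG,
        inputs=external_edges * IO_PER_EXTERNAL_EDGE,
        outputs=external_edges * IO_PER_EXTERNAL_EDGE,
    )


if __name__ == "__main__":
    print(core_resources(4, 4))
```

For a 4x4 array this reproduces the figures given above for the V18L4x4 core: 16 PEGs, 4,096 registers, 8,192 LUTs, 1,024 FGs, and 640 inputs and 640 outputs.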
Various EPGA blocks are available with optional on-board blocks of RAM. The RAM operates in a synchronous mode with an optional additional read data pipeline. For FIFO support, the RAM includes flag logic.
Each EPGA array has eight global clock networks with a clock skew of 0.25 nanoseconds. All eight globals connect to all adjacent functional groups and each can drive any register control line by either external I/O or an internal gate. Some applications require a large number of clocks and the EPGA architecture permits the creation of one additional clock per PEG.
Design methodology and flow
To gain acceptance by a broad spectrum of the ASIC design community, embedded IP must integrate closely with a standard ASIC design methodology and flow. VariCore EPGA cores have adopted a standard ASIC methodology and flow. In addition, these cores use a standard FPGA design methodology, which is fully integrated into the ASIC flow to create a parallel dual design flow model (see Figure 2). This methodology permits a predictable interface to, and support for, industry-standard third-party design, verification, and test software. The EPGA function of the design flow includes RTL coding, synthesis, place and route, and verification.
Users of VariCore reprogrammable cores will receive a GDSII database containing pin list and placement, top-level timing models, routing obstructions, physical core footprint, power and ground, and physical netlist for LVS (layout versus schematic). The cores have five layers of metal; the fifth metal layer can be used for routing. For physical design, files output in hard GDSII IP flows are compatible with Cadence's Virtuoso and Avant!'s Apollo. Integration of the core into the system-level design includes the GDSII block, test generation, and physical verification, or possibly even futuristic "evolvable hardware" solutions.
EPGA cores have pin-fixing capability, which enables a fixed interface to neighboring blocks. Signals to EPGA cores can be pre-designated, and each core contains four primary core-specific interfaces: power control, configuration, JTAG, and built-in self test (BIST). These interfaces provide configuration and test access to the EPGA array.
Actel's high-speed VariCore Compiler design tool supports design entry from Verilog or VHDL netlists. Synthesis support is provided by Synopsys Design Compiler or similar synthesis tools. The compiler also generates data files for use with Synopsys' PrimeTime and PrimePower performance and power simulation tools. The compiler outputs a bitstream, a post-layout SDF timing file, and post-layout Verilog or VHDL netlists (see Figure 3).
Place and route is also performed by the compiler, averaging 10,000 ASIC gates per minute, or about four minutes for the largest 40,000 ASIC gate 4x4 core.
The same core, once embedded in SOC silicon, can be programmed in 23 milliseconds, an acceptable time for possible in-system reconfiguration applications. The compiler is also supported by third-party design entry, verification, and test tools.
The toolset provides an automatic flow from the structural netlist to the final programming bitstream. The flow includes the following sequence. The netlist is verified for legality. The design is optimized: redundant logic is removed and inverters are moved into the look-up tables. The design is partitioned into PEG blocks and then into functional groups. All functional groups are routed to implement the design. Post-layout timing is extracted. The programming bitstream is generated. To support timing closure, the compiler reads SDC timing constraint files and uses a timing-driven partitioning and routing algorithm optimized for minimum power consumption.
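The partitioning step can be illustrated with a simple capacity bound. The sketch below is not the VariCore Compiler's algorithm, which the article does not describe. It only computes the minimum number of functional groups and PEG blocks a design could occupy, assuming each FG holds 8 LUTs and 4 registers (derived by dividing the per-PEG totals by an assumed 64 FGs per PEG) and ignoring routing and timing constraints. All names are hypothetical.

```python
LUTS_PER_PEG = 512        # from the article
REGISTERS_PER_PEG = 256   # from the article
FGS_PER_PEG = 64          # assumption: 1,024 FGs / 16 PEGs in a 4x4 core

# Assumed per-FG capacities, derived from the figures above.
LUTS_PER_FG = LUTS_PER_PEG // FGS_PER_PEG            # 8
REGISTERS_PER_FG = REGISTERS_PER_PEG // FGS_PER_PEG  # 4


def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return (a + b - 1) // b


def min_blocks(luts, registers):
    """Return (min_fgs, min_pegs) needed by capacity alone.

    This is a lower bound: a real partitioner may need more blocks
    once routing and timing are taken into account.
    """
    if luts < 0 or registers < 0:
        raise ValueError("resource counts must be non-negative")
    fgs = max(ceil_div(luts, LUTS_PER_FG),
              ceil_div(registers, REGISTERS_PER_FG))
    pegs = ceil_div(fgs, FGS_PER_PEG)
    return fgs, pegs


if __name__ == "__main__":
    # A hypothetical design with 1,000 LUTs and 100 registers.
    print(min_blocks(1_000, 100))
```

A design that needs more PEGs than the chosen array provides cannot fit regardless of how well it routes, so a bound of this kind is a cheap first check before running the full flow.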
Test and verification
Effort has been made to ensure that VariCore EPGA blocks can be tested and verified after being embedded within the complete SOC design. EPGAs are observable and controllable with BIST features, pin-fixing capability, and up to 1,280 I/O ports per core.
A BIST interface allows an optional, external BIST controller to be connected to the EPGA core. This controller is not used to achieve manufacturing test coverage, but allows users to run BIST on their own application circuits within the EPGA array. This may be required in an SOC that must implement BIST or otherwise provide self-test capability.
The interface allows the external BIST controller access to the scan chains within the IP block. These scan chains allow all of the user registers and I/O locations (boundary scanned) within the core to be loaded and sampled by a BIST controller. The number of scan chains varies with the core size; a 4x4 array has 16 internal and eight boundary scan chains.
For physical design verification, LVS, and design rule checks (DRCs), the compiler supports Cadence's Dracula and Avant!'s Hercules II. To enable verification of the EPGA core within the complete SOC environment, the compiler generates "enhanced" EPGA netlists and models that allow the complete EPGA function, including control interfaces and internal scan chains, to be verified within the SOC. An IEEE 1149.1-compliant JTAG interface is used for configuration, boundary scan, probe access, enabling of test modes, and enabling of BIST.
Using the JTAG interface, it is possible to probe the contents of the EPGA array and display the results on the user's design.
A debugger also allows test vectors to be applied to the user circuit and verified through its JTAG interface. Finally, the Actel-supplied manufacturing test vectors can be re-run on the EPGA within the SOC circuit using the JTAG interface.
Design-testing access is through the JTAG interface and uses a five-stage approach with JTAG vector sequencing supplied with the core. The five stages test the following: EPGA configuration memory, EPGA routing resources, EPGA RAM cells, EPGA logic cells, and external I/Os to and from the EPGA.
Rapid prototyping
A VariCore developer's kit includes an emulation board with an 0.18-micron EPGA development chip (produced on the UMC CMOS process) and co-design support.
Actual FPGA power consumption has always been very difficult to measure; it becomes even more imperative to gain accurate power measurements if the programmable core is integrated and buried within system silicon. The emulation board is also able to perform dynamic self-testing and generate real-time performance and power consumption data.
The VariCore embeddable reprogrammable cores give system-level designers an alternative to dedicated off-chip FPGAs. EPGA cores should enable ASIC and ASSP providers to get their products to market faster, cheaper, and with less risk. Embedded programmable logic will help keep products on the market longer.
Yankin Tanurhan, Ph.D. is the director of Actel Corporation's (San Jose, CA) embedded FPGA group. Prior to joining Actel, Tanurhan was senior manager of technology partnership programs for Synopsys. He has also held research and engineering positions at FZI-Forschungszentrum Informatik, Institute of Computer Aided Circuit Design and Informatik Forum in Germany.